Batching CSIDH Group Actions using AVX-512
Abstract
Commutative Supersingular Isogeny Diffie-Hellman (or CSIDH for short) is a recently-proposed post-quantum key establishment scheme that belongs to the family of isogeny-based cryptosystems. The CSIDH protocol is based on the action of an ideal class group on a set of supersingular elliptic curves and comes with some very attractive features, e.g. the ability to serve as a “drop-in” replacement for the standard elliptic curve Diffie-Hellman protocol. Unfortunately, the execution time of CSIDH is prohibitively high for many real-world applications, mainly due to the enormous computational cost of the underlying group action. Consequently, there is a strong demand for optimizations that increase the efficiency of the class group action evaluation, which is not only important for CSIDH, but also for related cryptosystems like the signature schemes CSI-FiSh and SeaSign. In this paper, we explore how the AVX-512 vector extensions (incl. AVX-512F and AVX-512IFMA) can be utilized to optimize the constant-time evaluation of the CSIDH-512 class group action with the goal of, respectively, maximizing throughput and minimizing latency. We introduce different approaches for batching group actions and computing them in SIMD fashion on modern Intel processors. In particular, we present a hybrid batching technique that, when combined with optimized (8 × 1)-way prime-field arithmetic, increases the throughput by a factor of 3.64 compared to a state-of-the-art (non-vectorized) x64 implementation. On the other hand, vectorization in a 2-way fashion aimed to reduce the latency makes our AVX-512 implementation of the group action evaluation about 1.54 times faster than the state-of-the-art. To the best of our knowledge, this paper is the first to demonstrate the high potential of using vector instructions to increase the throughput (resp. decrease the latency) of CSIDH.
Similar resources
Fast Sorting Algorithms using AVX-512 on Intel Knights Landing
The modern CPU’s design, which is composed of hierarchical memory and SIMD/vectorization capability, governs the potential for algorithms to be transformed into efficient implementations. The release of the AVX-512 changed things radically, and motivated us to search for an efficient sorting algorithm that can take advantage of it. In this paper, we describe the best strategy we have found, whi...
A Novel Hybrid Quicksort Algorithm Vectorized using AVX-512 on Intel Skylake
The modern CPU’s design, which is composed of hierarchical memory and SIMD/vectorization capability, governs the potential for algorithms to be transformed into efficient implementations. The release of the AVX-512 changed things radically, and motivated us to search for an efficient sorting algorithm that can take advantage of it. In this paper, we describe the best strategy we have found, whi...
Computing the Sparse Matrix Vector Product using Block-Based Kernels Without Zero Padding on Processors with AVX-512 Instructions
The sparse matrix-vector product (SpMV) is a fundamental operation in many scientific applications from various fields. The High Performance Computing (HPC) community has therefore continuously invested a lot of effort to provide an efficient SpMV kernel on modern CPU architectures. It has been shown that block-based kernels are helpful to achieve high performance, but also that they are diffic...
ELZAR: Triple Modular Redundancy using Intel AVX
Instruction-Level Redundancy (ILR) is a well known approach to tolerate transient CPU faults. It replicates instructions in a program and inserts periodic checks to detect and correct CPU faults using majority voting, which essentially requires three copies of each instruction and leads to high performance overheads. As SIMD technology can operate simultaneously on several copies of the data, i...
Journal
Journal title: IACR Transactions on Cryptographic Hardware and Embedded Systems
Year: 2021
ISSN: 2569-2925
DOI: https://doi.org/10.46586/tches.v2021.i4.618-649